Identification of autoimmune gene signatures 
in autism 

J-Y Jung, IS Kohane and DP Wall

The role of the immune system in neuropsychiatric diseases, including autism spectrum disorder (ASD), has long been 
hypothesized. This hypothesis has mainly been supported by family cohort studies and the immunological abnormalities found 
in ASD patients, but genetic association testing has so far yielded limited findings. We performed two cross-disorder genetic 
association tests on the genome-wide data sets of ASD and six autoimmune disorders. In the polygenic score test, we examined 
whether ASD risk alleles with low effect sizes work collectively in specific autoimmune disorders and show significant 
association statistics. In the genetic variation score test, we tested whether allele-specific associations between ASD and 
autoimmune disorders can be found using nominally significant single-nucleotide polymorphisms. In both tests, we found that 
ASD is probabilistically linked to ankylosing spondylitis (AS) and multiple sclerosis (MS). Association coefficients showed that 
ASD and AS were positively associated, meaning that autism susceptibility alleles may have a similar collective effect in AS. The 
association coefficients were negative between ASD and MS. Significant associations between ASD and two autoimmune 
disorders were identified. This genetic association supports the idea that specific immunological abnormalities may underlie the 
etiology of autism, at least in a number of cases. 

Translational Psychiatry (2011) 1, e63; doi:10.1038/tp.2011.62; published online 13 December 2011 



Introduction 

Autism spectrum disorder (ASD) is a broad spectrum of early-onset
neuropsychiatric disorders characterized by severe deficits in social
interaction and language, and the presence of repetitive and stereotyped
behaviors and interests. Twin studies have demonstrated that ASD is largely
genetic, with a 90% concordance between monozygotic twins, and heritable,
with a 5-10 times higher familial risk than in the general population.
Genome-wide association studies (GWASs) have discovered significant genetic
markers of single-nucleotide polymorphisms (SNPs) and de novo mutations of
copy number variations that may cause ASD. However, the molecular pathology
of ASD is largely unclear owing to its genetic heterogeneity and the fact
that only a small proportion of incidence is explained by known
susceptibility loci. Given this heterogeneity, it has been suggested that
cross-disease analysis between ASD and other disorders that share common
phenotypic symptoms or genetically susceptible loci will shed light on our
understanding of the molecular mechanisms underlying ASD.

The role of the immune system in neuropsychiatric diseases, including ASD,
has long been hypothesized and is mainly supported by family cohort studies
and the immunological abnormalities found in autistic patients. For example,
Atladottir et al. reported that the risk of ASD increases when a child's
mother has rheumatoid arthritis or has a family history of type 1 diabetes.
Other autoimmune disorders for which epidemiological studies have shown
significant association with ASD include maternal psoriasis, maternal
ulcerative colitis, and autoimmune thyroid disease (ATD). In addition,
various forms of immune dysregulation, including elevated cytokine levels
and increased immunoglobulin and serum protein levels, have been identified
in autistic children. However, few studies have reported potential genetic
components that account for associations between ASD and autoimmune
disorders. In fact, only three alleles, two in the human leukocyte antigen
(HLA-DR4, DR13, HLA-A2) and one in the major histocompatibility complex
(MHC) of chromosome 6 (the C4B null allele) have a confirmed association
with ASD. With the advent of GWAS and the availability of large amounts of
genotype data from both ASD and multiple autoimmune diseases, we now have
the ability to directly analyze genetic associations across these diseases.

Materials and methods 

Study samples and data quality control. We obtained Illumina BeadChip 550K
genotype data of 941 multiplex families (Illumina Inc., San Diego, CA, USA)
with autistic children from the Autism Genetic Resource Exchange (AGRE). For
the phenotype labeling, we followed the classification of the Autism
Diagnostic Interview-Revised included in the AGRE phenotype database, but
excluded individuals whose Autism Diagnostic Interview-Revised
classification and Autism Diagnostic Observation Schedule classification did
not agree. We examined each family's data to find monozygotic twins,
triplets, or quadruplets and elected to include only one monozygotic sibling
per family, filtering 72 individuals from the data set. We also removed 92
individuals who were annotated as 'possible non-idiopathic autism' in the
AGRE phenotype database. These cases include prematurity of less than 35
weeks gestation (45 individuals), Fragile X syndrome (12 individuals), known
chromosomal abnormality (10 individuals), and other diagnosed neurogenetic
disorders (10 individuals). We further applied a set of quality control
filters in order to identify a stringent subset of robust SNPs. We excluded
32 individuals with genotyping rate <0.95 and 18510 SNPs with genotyping
rate <0.95. We examined Mendelian errors in each family trio, and removed
seven individuals with Mendelian errors in >1 percent of all markers. SNPs
with a minor allele frequency <0.05 or with a Hardy-Weinberg equilibrium
exact test P<0.001 were excluded from further analysis. After quality
control, the ASD genomic data consisted of 1397 affected trios and a total
of 470025 SNPs. We used the software package PLINK to conduct the
transmission disequilibrium test (TDT) with all family trios passing quality
control without identifying subpopulations, as the TDT is known to maintain
the desired type 1 error rate in the presence of population stratification.
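
As a rough illustration of the statistic behind the TDT, the sketch below
computes the McNemar-type chi-square from counts of transmitted and
untransmitted minor alleles among heterozygous parents. The function name
`tdt_statistic` and the example counts are hypothetical and are not taken
from the AGRE data; the analysis itself was run in PLINK, which also handles
missing genotypes and Mendelian errors.

```python
from math import erfc, sqrt


def tdt_statistic(transmitted, untransmitted):
    """Return (chi-square, P-value, odds ratio) for one SNP.

    transmitted / untransmitted: hypothetical counts of heterozygous parents
    that did / did not pass the minor allele to an affected child.
    """
    if transmitted < 0 or untransmitted <= 0:
        raise ValueError("need a non-negative and a positive count")
    total = transmitted + untransmitted
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    # Ratio of transmitted to untransmitted alleles, used here as the odds ratio.
    odds_ratio = transmitted / untransmitted
    return chi2, p_value, odds_ratio


chi2, p_value, odds_ratio = tdt_statistic(60, 40)
print(f"chi2={chi2:.2f} P={p_value:.4f} OR={odds_ratio:.2f}")
```

Because only transmissions within families are compared, the statistic does
not depend on allele frequencies in an external control group, which is the
property that makes it robust to population stratification.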

We obtained two groups of GWAS data from the Wellcome Trust Case Control
Consortium (WTCCC) as target disease sets for measuring association with
ASD. The smaller set (WTCCC4) consisted of 1500 common controls and 1000
independent cases each of ankylosing spondylitis (AS), autoimmune thyroid
disease, multiple sclerosis (MS), and breast cancer. It had 14436
non-synonymous SNPs plus 897 SNPs in the major histocompatibility complex
region, and was genotyped using the Illumina Infinium 15K array. The larger
set (WTCCC7), genotyped on the Affymetrix GeneChip 500K array (Affymetrix
Inc., Santa Clara, CA, USA), consisted of 3000 shared controls and 2000
independent cases in each of seven diseases: bipolar disorder, coronary
artery disease, Crohn's disease, hypertension, rheumatoid arthritis, type 1
diabetes and type 2 diabetes. Shared control samples came from two sources:
for the smaller set (WTCCC4), control samples were taken from the 1958
British Birth Cohort (58C), and for the larger set (WTCCC7), 1500 control
samples were taken from 58C and the remaining 1500 samples were taken from
blood donors recruited by the three UK Blood Services (UKBS). Although the
ages of the former are known and past the typical age of onset for the
autoimmune conditions studied, the individual age information of the UKBS
group is not known. The potential variation in ages in this control group
could introduce minor classification bias, as some individuals may have
developed, or may yet develop, an autoimmune disorder, as discussed in the
main WTCCC7 article. However, these autoimmune disorders are relatively
rare, with a combined prevalence of up to 8 percent. Therefore, although we
do not have detailed age information for the UKBS samples, we assume that
misclassification due to age of onset would be rare and unlikely to bias our
results. We followed the quality control steps described in Burton et al.,
and further removed SNPs with minor allele frequency <0.05 or Hardy-Weinberg
equilibrium exact test P<0.001. After quality control steps, there were
12700 SNPs remaining in the WTCCC4 set, and 469557 SNPs were retained for
further analysis in the WTCCC7 set. Across all SNPs passing quality control,
we used case-control association analysis with the software package PLINK.
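
The case-control association step yields, for each SNP, the odds ratio and
P-value that the later score calculations rely on. The sketch below shows a
basic allelic test on a 2x2 table of allele counts. It is an illustration
under simplifying assumptions, not PLINK's implementation; the function name
`allelic_association` and the counts are hypothetical.

```python
from math import erfc, sqrt


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Odds ratio and 1-df chi-square test from a 2x2 table of allele counts."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    if min(a, b, c, d) <= 0:
        raise ValueError("all four allele counts must be positive")
    n = a + b + c + d
    # Odds ratio with respect to the minor allele: >1 means enriched in cases.
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


odds_ratio, chi2, p_value = allelic_association(30, 70, 20, 80)
print(f"OR={odds_ratio:.3f} chi2={chi2:.3f} P={p_value:.3f}")
```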

As these data were genotyped using different platforms, we cross-referenced
SNP ids and strand information as follows. First, we converted all custom,
non-reference SNP ids into corresponding reference SNP ids by querying the
UCSC Genome Browser (http://www.genome.ucsc.edu; version May 2004/NCBI
genome build 35). Then we examined all reference SNP ids to confirm that
they were up to date by querying NCBI dbSNP (build 132) and removed six SNPs
that had multiple identifiers. Next, we checked all the allelic and
chromosomal position information of each data set against the HapMap
genotype data (CEU founders, release 23), reconfiguring the strands when
alleles did not match. Mapping SNPs into the corresponding gene regions was
done by querying NCBI dbSNP (build 132). The number of intersection SNPs
between the ASD and WTCCC4 sets was 5318, and the number of intersection
SNPs between the ASD and WTCCC7 sets was 73331.
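
The strand check and the intersection of SNP sets can be sketched as
follows. This is a simplified illustration: the helper names
`align_to_reference` and `shared_snps` are hypothetical, the reference
alleles stand in for the HapMap annotation, and strand-ambiguous A/T and C/G
SNPs are not given any special treatment here.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def align_to_reference(alleles, reference):
    """Express a SNP's two alleles on the reference strand.

    Returns the alleles unchanged if they already match the reference pair,
    their complements if the opposite strand matches, and None otherwise.
    """
    expected = frozenset(reference)
    if frozenset(alleles) == expected:
        return tuple(alleles)
    flipped = tuple(COMPLEMENT[allele] for allele in alleles)
    if frozenset(flipped) == expected:
        return flipped
    return None  # alleles cannot be reconciled with the reference


def shared_snps(source, target):
    """Sorted SNP ids present in both data sets (dicts keyed by SNP id)."""
    return sorted(set(source) & set(target))


print(align_to_reference(("A", "G"), ("C", "T")))
print(shared_snps({"rs1": 1, "rs2": 2}, {"rs2": 0, "rs3": 0}))
```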

Polygenic score analysis. To identify genomic associations between diseases,
we used two separate and complementary approaches, as illustrated in
Figure 1. The first, the polygenic score (PS) test, measured the collective
effect of disease-associated SNPs from one disease on a collection of SNPs
from another. The test was designed to identify associations between complex
multigenic diseases that manifest through a combination of multiple variants
of small individual effect. We used the PS test here to determine if
collections of variants from cases with ASD appear to correlate with groups
of variants from individuals with autoimmune disorders. First, we labeled a
GWAS data set from one disease as the 'source' data set and another GWAS
from a different disease as the 'target'. From the source data set we
selected groups of autosomal SNPs that corresponded to a range of nominal
significance thresholds (PT) as source alleles, and recorded the minor
allele and odds ratio information of those selected SNPs. Then, for each
individual in the target data set, we calculated the PS by computing the
average number of source alleles that the individual had, weighted by the
logarithm of the odds ratio (log(odds-ratio)) from the source data set. This
polygenic score can be considered to be a measure of probabilistic
similarity between those SNPs in the source disease data set and each
individual in the target data set. As such, if the source and target
diseases were highly similar in allelic composition, the cases would yield
consistently higher scores than the controls. If there were multiple target
diseases, ones that were more closely related to the source disease would
yield a higher average PS, as demonstrated by the International
Schizophrenia Consortium. To provide this context, we ran logistic
regression analysis with the PS to predict the classification of the target
disease. Then, we estimated the variance explained in the target disease
data by the PS using Nagelkerke's pseudo R-square from a model with the PS
and covariates, versus that from a model without the score. We took the
total number of alleles used to calculate the PS and the numerically coded
site information of individuals in the WTCCC data sets as covariates.

Figure 1 Analysis steps in the polygenic score approach (a) and genetic
variation score approach (b). In both approaches, P-values and odds-ratios
were calculated from the transmission disequilibrium test (ASD data) or
case/control association test (WTCCC data) after quality control steps.
(a) From the ASD data set and a given P-value threshold PT, a set of SNPs
with P<PT were selected as 'source alleles'. Then for each target disease
data set, the polygenic score was calculated for each individual. We
conducted logistic regression analysis with the polygenic scores in order to
examine whether source alleles from ASD data could explain variances in the
target disease data sets. (b) For each ASD and autoimmune data set, the
genetic variation score was calculated for each SNP. We used Pearson's
pair-wise correlation to compare the collective effect of SNPs in two
diseases.
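
The polygenic score calculation described in this section can be sketched
for a single target individual as below. This is a minimal reading of the
text, not the authors' code: the function name `polygenic_score`, the
dictionary layout and the toy SNP statistics are all hypothetical, and the
score is averaged over the selected SNPs, which is one plausible
interpretation of "average number of source alleles".

```python
from math import log


def polygenic_score(genotype, source_stats, p_threshold):
    """Average log(odds-ratio)-weighted minor-allele count for one individual.

    genotype: dict of SNP id -> minor-allele count (0, 1 or 2) in the target set.
    source_stats: dict of SNP id -> (P-value, odds ratio) from the source set.
    """
    # Source alleles: SNPs below the threshold that are also typed in the target.
    selected = [snp for snp, (p_value, _) in source_stats.items()
                if p_value < p_threshold and snp in genotype]
    if not selected:
        raise ValueError("no source alleles pass the threshold")
    weighted = sum(genotype[snp] * log(source_stats[snp][1]) for snp in selected)
    return weighted / len(selected)


# Hypothetical source statistics and one hypothetical target genotype.
source = {"rs1": (0.01, 2.0), "rs2": (0.2, 1.5), "rs3": (0.04, 0.5)}
person = {"rs1": 2, "rs2": 0, "rs3": 1}
print(round(polygenic_score(person, source, 0.05), 4))
```

Raising `p_threshold` admits more weakly associated SNPs into the score,
which is how the five thresholds in the analysis probe the collective
effect of alleles with small individual effects.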



Genetic variation score analysis. The second metric we used to compare ASD
to autoimmune disorders was the genetic variation score (GVS). With GVS, a
combined score of the odds-ratio and P-value was defined for each disease
data set and every SNP belonging to that set. Specifically, for each disease
data set d and a SNP s in d, the GVS of [d, s] was defined as
sign(log(odds-ratio[d, s])) * (-log10(P-value[d, s])). Given that the
odds-ratio was calculated with respect to the minor allele of the SNP and
that the odds-ratio was greater than 1 when the minor allele was more likely
to occur in the case group, sign(log(odds-ratio)) was positive when the
minor allele was the risk allele, and negative when the major allele was the
risk allele. When the GVS scores of a SNP in two diseases had different
signs, we assumed that the allele was protective in one disease and
deleterious in the other. As an extension of this allele-specific
comparison, we computed the Pearson pair-wise correlation between the GVS
data of two diseases. This correlation analysis enabled us to determine
whether large numbers of SNPs have similar effects in two different
diseases, that is, a high positive correlation coefficient indicated similar
effects, whereas a strong negative coefficient indicated opposing effects of
the risk alleles.

Although both of these approaches were based on individual SNP association
and framed in terms of odds-ratios and P-values, they provided two
complementary assessments of the association between the genotypes of two
different diseases. The PS test assigned a score per individual in the
target disease set and assumed that many weakly associated SNPs with
marginal odds-ratios may work collectively, such that they have stronger
association test statistics than loci drawn from the null distribution. The
GVS and its Pearson correlation as a similarity measure examined whether
there were SNPs with strong significance in both disease sets, as GVS
compared the pair-wise, allele-specific significance of each SNP.
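
The genetic variation score and its pair-wise correlation can be sketched as
follows. This is a minimal illustration of the definition given in this
section; the function names, the dictionary layout and the example
statistics are hypothetical, and the restriction to SNPs with P<0.05 in both
diseases follows the filter used for the correlation analysis.

```python
from math import log10, sqrt


def gvs(odds_ratio, p_value):
    """Signed -log10(P): positive when the minor allele is the risk allele."""
    if odds_ratio > 1:
        sign = 1.0
    elif odds_ratio < 1:
        sign = -1.0
    else:
        sign = 0.0
    return sign * -log10(p_value)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def gvs_correlation(disease_a, disease_b, alpha=0.05):
    """Correlate GVS over SNPs nominally significant in both diseases.

    Each argument maps SNP id -> (odds ratio, P-value); the data are hypothetical.
    """
    shared = sorted(snp for snp in disease_a if snp in disease_b
                    and disease_a[snp][1] < alpha and disease_b[snp][1] < alpha)
    scores_a = [gvs(*disease_a[snp]) for snp in shared]
    scores_b = [gvs(*disease_b[snp]) for snp in shared]
    return pearson(scores_a, scores_b), len(shared)


a = {"rs1": (2.0, 0.01), "rs2": (0.5, 0.001), "rs3": (1.5, 0.04), "rs4": (1.2, 0.5)}
b = {"rs1": (1.8, 0.02), "rs2": (0.6, 0.01), "rs3": (1.3, 0.03), "rs4": (0.7, 0.01)}
print(gvs_correlation(a, b))
```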



Results 

PS analysis. We examined whether groups of ASD-associated SNPs can
collectively account for genomic variation in another disease, even when
each individual SNP may not have a very strong effect. From the ASD source
sample, we selected sets of marginally associated alleles at five different
P-value thresholds (PT<0.01, 0.05, 0.1, 0.25 and 0.5). After calculating the
PS for each individual in each target autoimmune data set, we performed a
logistic regression analysis using the PS as a predictor of target disease
classification, and estimated the variance in the target sample explained by
the PS using Nagelkerke's pseudo R^2. Table 1 summarizes this measure with
P-value thresholds for each group of source alleles and target diseases.
Source alleles from ASD data showed a significant enrichment in AS cases
(P=2.22 x 10^-16 at PT<0.5), explaining about two percent of variance. In
addition, the effect of source alleles became larger when the P-value
threshold PT was increased, supporting our hypothesis that alleles with
little individual effect in one disease may work collectively in other
diseases. In MS, source alleles showed a similar pattern of increasing R^2
with increasing PT thresholds, explaining about one percent of variance
(P=5.16 x 10^-5 at PT<0.5). In contrast to AS, however, the coefficients of
the PS in MS were all negative and the mean score was higher in controls
(data not shown), suggesting that ASD risk alleles have opposite overall
effects in these two diseases. In other WTCCC target diseases, the variance
explained by ASD source alleles was either too small (R^2<0.01), or was not
significant (P>0.05) over all P-value thresholds.
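
The variance-explained measure reported in Table 1 can be sketched from
model log-likelihoods as below. The Nagelkerke formula is standard, but the
function names are hypothetical, the log-likelihoods are assumed to come
from logistic regression models fitted elsewhere, and taking the difference
between the model with the PS and the covariate-only model is one reading of
the comparison described in the Methods.

```python
from math import exp


def nagelkerke_r2(loglik_null, loglik_model, n_samples):
    """Nagelkerke's pseudo R^2 from two maximized log-likelihoods."""
    # Cox-Snell R^2, then rescaled by its maximum attainable value.
    cox_snell = 1.0 - exp(2.0 * (loglik_null - loglik_model) / n_samples)
    max_cox_snell = 1.0 - exp(2.0 * loglik_null / n_samples)
    return cox_snell / max_cox_snell


def variance_explained_by_score(loglik_null, loglik_covariates, loglik_full,
                                n_samples):
    """Pseudo R^2 of the model with the PS minus that of the covariate-only model."""
    return (nagelkerke_r2(loglik_null, loglik_full, n_samples)
            - nagelkerke_r2(loglik_null, loglik_covariates, n_samples))


# Hypothetical log-likelihoods for 100 samples with balanced cases and controls.
ll_null = -69.31471805599453  # intercept-only model: 100 * log(0.5)
print(round(variance_explained_by_score(ll_null, -65.0, -60.0, 100), 4))
```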

Table 1 Logistic regression analysis results of ASD and WTCCC disease data

Source: ASD. Target sample pseudo-R^2 x 100 (P-value)

PT        0.01             0.05             0.1              0.25             0.5

WTCCC4
AS        0.21 (1.44e-07)  0.39 (2.14e-09)  1.04 (5.27e-11)  1.59 (1.04e-13)  2.28 (2.22e-16)
ATD       0.01 (1.99e-04)  0.17 (9.51e-05)  0.35 (1.26e-06)  0.31 (6.64e-08)  0.33 (1.72e-08)
BC        0.00 (2.29e-11)  0.10 (9.09e-11)  0.01 (6.75e-11)  0.03 (8.88e-15)  0.02 (1.11e-16)
MS        0.07 (2.79e-01)  0.17 (8.49e-01)  0.76 (8.81e-04)  0.90 (7.56e-05)  1.05 (5.16e-05)

WTCCC7
BD        0.33 (1.54e-02)  0.38 (8.52e-03)  0.08 (3.70e-01)  0.25 (4.49e-02)  0.16 (1.42e-01)
CAD       0.01 (9.27e-01)  0.38 (8.36e-03)  0.12 (2.17e-01)  0.04 (6.12e-01)  0.06 (3.90e-01)
CD        0.11 (2.64e-01)  0.17 (1.73e-01)  0.22 (7.33e-02)  0.25 (2.46e-01)  0.20 (5.01e-02)
HT        0.03 (7.10e-01)  0.11 (2.46e-01)  0.02 (7.87e-01)  0.09 (8.29e-01)  0.11 (8.68e-01)
RA        0.05 (5.69e-01)  0.15 (4.15e-02)  0.25 (1.24e-01)  0.17 (8.83e-02)  0.22 (5.16e-02)
T1D       0.11 (2.34e-01)  0.16 (1.29e-01)  0.21 (6.99e-02)  0.38 (7.32e-03)  0.29 (1.58e-02)
T2D       0.04 (6.34e-01)  0.18 (1.01e-01)  0.04 (5.75e-01)  0.05 (7.22e-01)  0.03 (8.64e-01)

Abbreviations: AS, ankylosing spondylitis; ASD, autism spectrum disorder; ATD, autoimmune thyroid disease; BC, breast cancer; BD, bipolar disorder; CAD,
coronary artery disease; CD, Crohn's disease; HT, hypertension; MS, multiple sclerosis; PT, P-value threshold; RA, rheumatoid arthritis; T1D, type 1 diabetes; T2D,
type 2 diabetes; WTCCC, Wellcome Trust Case Control Consortium.

Genetic variation score analysis. Next, we examined whether allele-specific
associations can be found among ASD and autoimmune disease data sets via the
GVS approach. As we examined the collective effect of marginally significant
SNPs and found enrichment in two autoimmune diseases in the PS approach,
here we focused our analysis on SNPs with at least nominally significant
P values (P<0.05). Multiple hypothesis correction was not applied to the
P values when calculating GVS because we examined associations of disease
pairs by correlation coefficients, which would not change after multiple
hypothesis correction. Table 2 shows the Pearson correlation coefficients
for all disease pairs within ASD and the WTCCC data sets, calculated with
GVS. Consistent with the results from the PS analysis above, the ASD data
exhibited significant positive association with AS (coefficient 0.4032) and
significant negative association with MS (coefficient -0.3092). AS and MS
were also strongly negatively associated with one another (coefficient
-0.3092). Interestingly, the coefficient of association between ASD and AS
was higher than any of the autoimmune disease pairs, while the strength of
the association between ASD and MS was comparable to the strengths of
association between any pair of autoimmune disorders. Autoimmune thyroid
disease was slightly positively correlated with ASD (coefficient 0.2012),
whereas all other autoimmune diseases (Crohn's disease, rheumatoid arthritis
and type 1 diabetes) showed little association with ASD in their profiles.
Table 3 summarizes the nominally significant SNPs in both ASD and the
autoimmune diseases, including AS and MS. Many of these SNPs fell within
already known ASD risk genes (for example, rs2034648-AGAP1) or mental
disorder susceptibility genes (for example, rs11643718-SLC12A3 and
rs3132468-MICB). Another group of SNPs fell within the protein coding
regions of several known autoimmune risk genes (for example,
rs3130559-PSORS1C1, rs3129943-C6orf10 and rs3130542-HLA-C). However, none of
these have been reported previously as ASD susceptibility genes.

Table 2 Pairwise GVS correlation coefficients between ASD and WTCCC data sets

              AS            ATD           BC            MS
GVS (#SNPs)   0.4032 (28)   0.2012 (34)   0.0254 (16)   -0.3092 (36)

              BD             CAD            CD             HT             RA             T1D            T2D
GVS (#SNPs)   -0.0895 (276)  -0.0868 (267)  -0.0125 (288)  -0.0360 (250)  -0.0588 (258)  0.0127 (240)   -0.0546 (261)

Abbreviations: AS, ankylosing spondylitis; ASD, autism spectrum disorder; ATD, autoimmune thyroid disease; BC, breast cancer; BD, bipolar disorder; CAD,
coronary artery disease; CD, Crohn's disease; GVS, genetic variation score; HT, hypertension; MS, multiple sclerosis; RA, rheumatoid arthritis; SNPs,
single-nucleotide polymorphisms; T1D, type 1 diabetes; T2D, type 2 diabetes; WTCCC, Wellcome Trust Case Control Consortium. The number of common SNPs with
P<0.05 in both diseases is shown in parentheses.

Table 3 GVS comparison with ASD, AS, ATD and MS

Chr.  SNP Id       Gene           ASD      AS       ATD      MS
1     rs11583896                  -1.35    -1.94    -1.71
2     rs2034648    AGAP1          -1.89             1.42     2.67
4     rs11721758                  -1.73    -1.79    -1.46    -1.58
6     rs3130559    PSORS1C1       1.72     5.59     -3.54    -2.03
6     rs3132468    MICB           -1.89    -16.48   -1.61    6.16
6     rs2844494                   -1.64    -14.87   -5.01    3.19
6     rs4678       VARS2          -1.72    -5.22    8.78     -1.38
6     rs926070                    -1.39    -2.21    -3.02    26.67
6     rs9263871                   1.36     -10.11   -2.28    -1.67
6     rs3129943    C6orf10        1.38     -3.57    11.43    -3.02
6     rs2239705    ATP6V1G2       -1.36    7.31     -1.79
6     rs6905949    TRIM15         -1.44    -3.83    -1.65
6     rs4248154    LOC100294091   1.87     6.90              -2.14
6     rs3130542    HLA-C          -2.13    -12.74            7.94
6     rs2071596    BAT1           1.97     45.76             -2.42
6     rs2050189    C6orf10        1.53              14.09    -1.86
6     rs2256965                   -1.54             -2.96    5.42
6     rs928815     LOC100289233   -1.66             -3.54    4.84
16    rs11643718   SLC12A3        2.59              1.48     1.97
17    rs4792311                   1.51     1.48     1.39     1.47
17    rs2271233    TEKT1          1.93              -3.12    -2.14
19    rs1559155    PPFIA3         1.74     1.31              1.73
19    rs7258236    SH2D3A         -1.95    -1.79             -1.79

Abbreviations: AS, ankylosing spondylitis; ASD, autism spectrum disorder;
ATD, autoimmune thyroid disease; Chr., chromosome; GVS, genetic variation
score; MS, multiple sclerosis; SNP, single-nucleotide polymorphism.

Discussion

The magnitude of the association between ASD and the two autoimmune
diseases, AS and MS, was either greater than or on par with the strength of
association between what are now considered to be genetically similar
autoimmune diseases, and no other association of the same significance was
found between ASD and the remaining autoimmune disorders examined. Taken
together, our results demonstrate that there are genomic links between ASD
and the two autoimmune diseases, links that likely can inform our
understanding of the genetics and treatments of ASD. However, further study
and verification are required to characterize and explain these particular
genomic associations. An interesting, albeit anecdotal, similarity between
ASD and AS is that they both have an appreciable male bias. AS has a
male-to-female ratio of approximately 2.5:1, which is of the same magnitude
as the male bias of ASD's 4:1, while other autoimmune diseases, most
importantly including MS, generally show higher susceptibility in females.
Also of interest to the observed genomic similarities between ASD and AS is
anti-TNF (tumor necrosis factor) alpha therapy. Anti-TNF agents are among
the most efficacious options for treatment of AS, and, although not well
studied to date, ASD cases have been shown to have increased expression
levels of TNF-alpha and IL-S, and separate cases have received anti-TNF
therapy. Interestingly, anti-TNF agents tend to cause demyelination as an
adverse side effect, a symptom that is typical in MS.

To rule out the possibility that the ratio of males to females in our
original data sets biased our findings, we constructed sex-balanced and
gender-specific data sets and recomputed the polygenic scores. The direction
and relative strength of correlation with AS and MS remained unchanged,
indicating that our results are not unduly influenced by the different
numbers of male and female cases in the data sets. To rule out the
possibility that differences in population heterogeneity and ethnic
background between the ASD and the autoimmunity data collections biased the
findings, we tested for association using only ASD individuals of European
ancestry to match the ancestry of the individuals included within the
autoimmune data sets. With a reduced number of 1019 affected trios from the
ASD data set, we conducted TDT analysis and PS testing and found results
consistent with those found using the complete ASD data set: a positive
association with AS, a negative association with MS, and no other
significant associations between ASD and the remaining autoimmune disorders
studied. These results suggest that differences in mixed ancestry between
the two data sets did not influence the associations discovered. However,
this lack of bias from mixed ancestry does not preclude the possibility that
geographical differences (our ASD collection was from the United States,
while autoimmune collections were from the United Kingdom), and consequent
differences in environmental exposure, could influence risk for disease
differently, as has been shown in the MS cases. That said, differences in
exposure related to geography are more likely to cause underestimates of
association rather than overestimates and are thus unlikely to alter the
results shown here. An additional potential bias could arise from the age of
onset of autoimmune conditions. For example, MS tends to manifest in women
during childbearing ages. Our ASD sample contains mothers who have, by
definition, passed childbearing age, but because the mothers served as
controls in the association testing, the observed negative association
between ASD and MS suggests that mothers in the ASD set had a higher loading
of MS-related alleles, not a lower loading. Under this assumption, it is
unlikely that differences in ages of the sampled populations biased the
significance or directionality of the association identified. Nevertheless,
further studies with more equivalent samples across parameters, including
age, ancestry and geography, would be valuable to verify our findings.

In conclusion, we found significant, allele-specific genomic 
associations between ASD and two autoimmune diseases, 
AS and MS, which were supported using two complementary 
analytical strategies. The first, a PS approach, revealed that a 
collection of relatively weakly associated ASD susceptibility 
markers could still explain a significant percentage of the 
variation in both AS and MS cases. Coefficients from logistic 
regression analysis with the polygenic scores showed that 
the collective, allele-specific role of SNPs in ASD and AS 
was similar, whereas the roles of the alleles explaining the 
similarity between ASD and MS were of opposite effect, 
conferring risk in one and protection from onset in the other. 
The second, a genetic variation score approach, found the 
same results of allele-specific association between ASD and 
AS and ASD and MS with comparable or higher strength of 
association than that found between any autoimmune disease 
pair. Together these results suggest that common genetic 
mechanisms exist between ASD and AS and that opposing 
genetic mechanisms exist between ASD and MS. Both 
approaches pinpoint sets of SNPs that comprise the 
significant associations seen in our study and that may be of 
value as targets for further experiments aimed at under- 
standing the genetic ties between ASD and autoimmune 
diseases. 